AITopics | attacker and defender

Collaborating Authors

attacker and defender

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AEGIS : Automated Co-Evolutionary Framework for Guarding Prompt Injections Schema

Liu, Ting-Chun, Hsu, Ching-Yu, Lee, Kuan-Yi, Fu, Chi-An, Lee, Hung-yi

arXiv.org Artificial IntelligenceOct-10-2025

Prompt injection attacks pose a significant challenge to the safe deployment of Large Language Models (LLMs) in real-world applications. While prompt-based detection offers a lightweight and interpretable defense strategy, its effectiveness has been hindered by the need for manual prompt engineering. To address this issue, we propose AEGIS , an Automated co-Evolutionary framework for Guarding prompt Injections Schema. Both attack and defense prompts are iteratively optimized against each other using a gradient-like natural language prompt optimization technique. This framework enables both attackers and defenders to autonomously evolve via a Textual Gradient Optimization (TGO) module, leveraging feedback from an LLM-guided evaluation loop. We evaluate our system on a real-world assignment grading dataset of prompt injection attacks and demonstrate that our method consistently outperforms existing baselines, achieving superior robustness in both attack success and detection. Specifically, the attack success rate (ASR) reaches 1.0, representing an improvement of 0.26 over the baseline. For detection, the true positive rate (TPR) improves by 0.23 compared to the previous best work, reaching 0.84, and the true negative rate (TNR) remains comparable at 0.89. Ablation studies confirm the importance of co-evolution, gradient buffering, and multi-objective optimization. We also confirm that this framework is effective in different LLMs. Our results highlight the promise of adversarial training as a scalable and effective approach for guarding prompt injections.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.00088

Genre: Research Report > New Finding (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

AdvEvo-MARL: Shaping Internalized Safety through Adversarial Co-Evolution in Multi-Agent Reinforcement Learning

Pan, Zhenyu, Zhang, Yiting, Liu, Zhuo, Tang, Yolo Yunlong, Zhang, Zeliang, Luo, Haozheng, Han, Yuwei, Zhang, Jianshu, Wu, Dennis, Chen, Hong-Yu, Lu, Haoran, Fang, Haoyang, Li, Manling, Xu, Chenliang, Yu, Philip S., Liu, Han

arXiv.org Artificial IntelligenceOct-3-2025

LLM-based multi-agent systems excel at planning, tool use, and role coordination, but their openness and interaction complexity also expose them to jailbreak, prompt-injection, and adversarial collaboration. Existing defenses fall into two lines: (i) self-verification that asks each agent to pre-filter unsafe instructions before execution, and (ii) external guard modules that police behaviors. The former often underperforms because a standalone agent lacks sufficient capacity to detect cross-agent unsafe chains and delegation-induced risks; the latter increases system overhead and creates a single-point-of-failure-once compromised, system-wide safety collapses, and adding more guards worsens cost and complexity. To solve these challenges, we propose AdvEvo-MARL, a co-evolutionary multi-agent reinforcement learning framework that internalizes safety into task agents. Rather than relying on external guards, AdvEvo-MARL jointly optimizes attackers (which synthesize evolving jailbreak prompts) and defenders (task agents trained to both accomplish their duties and resist attacks) in adversarial learning environments. To stabilize learning and foster cooperation, we introduce a public baseline for advantage estimation: agents within the same functional group share a group-level mean-return baseline, enabling lower-variance updates and stronger intra-group coordination. Across representative attack scenarios, AdvEvo-MARL consistently keeps attack-success rate (ASR) below 20%, whereas baselines reach up to 38.33%, while preserving-and sometimes improving-task accuracy (up to +3.67% on reasoning tasks). These results show that safety and utility can be jointly improved without relying on extra guard agents or added system overhead.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2510.01586

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

What Makes a Dribble Successful? Insights From 3D Pose Tracking Data

Schepers, Michiel, Robberechts, Pieter, Van Haaren, Jan, Davis, Jesse

arXiv.org Artificial IntelligenceJul-1-2025

Data analysis plays an increasingly important role in soccer, offering new ways to evaluate individual and team performance. One specific application is the evaluation of dribbles: one-on-one situations where an attacker attempts to bypass a defender with the ball. While previous research has primarily relied on 2D positional tracking data, this fails to capture aspects like balance, orientation, and ball control, limiting the depth of current insights. This study explores how pose tracking data-- capturing players' posture and movement in three dimensions--can improve our understanding of dribbling skills. We extract novel pose-based features from 1,736 dribbles in the 2022/23 Champions League season and evaluate their impact on dribble success. Our results indicate that features capturing the attacker's balance and the alignment of the orientation between the attacker and defender are informative for predicting dribble success. Incorporating these pose-based features on top of features derived from traditional 2D positional data leads to a measurable improvement in model performance.

data mining, dribble, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.22503

Country: Europe > Belgium (0.15)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.69)

Add feedback

Embedded Mean Field Reinforcement Learning for Perimeter-defense Game

Wang, Li, Yu, Xin, Lv, Xuxin, Ai, Gangzheng, Wu, Wenjun

arXiv.org Artificial IntelligenceMay-21-2025

With the rapid advancement of unmanned aerial vehicles (UAVs) and missile technologies, perimeter-defense game between attackers and defenders for the protection of critical regions have become increasingly complex and strategically significant across a wide range of domains. However, existing studies predominantly focus on small-scale, simplified two-dimensional scenarios, often overlooking realistic environmental perturbations, motion dynamics, and inherent heterogeneity--factors that pose substantial challenges to real-world applicability. To bridge this gap, we investigate large-scale heterogeneous perimeter-defense game in a three-dimensional setting, incorporating realistic elements such as motion dynamics and wind fields. We derive the Nash equilibrium strategies for both attackers and defenders, characterize the victory regions, and validate our theoretical findings through extensive simulations. To tackle large-scale heterogeneous control challenges in defense strategies, we propose an Embedded Mean-Field Actor-Critic (EMFAC) framework. EMFAC leverages representation learning to enable high-level action aggregation in a mean-field manner, supporting scalable coordination among defenders. Furthermore, we introduce a lightweight agent-level attention mechanism based on reward representation, which selectively filters observations and mean-field information to enhance decision-making efficiency and accelerate convergence in large-scale tasks. Extensive simulations across varying scales demonstrate the effectiveness and adaptability of EMFAC, which outperforms established baselines in both convergence speed and overall performance. To further validate practicality, we test EMFAC in small-scale real-world experiments and conduct detailed analyses, offering deeper insights into the framework's effectiveness in complex scenarios.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2505.14209

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Government > Military (1.00)
Education (0.68)
Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Agentic AI and the Cyber Arms Race

Oesch, Sean, Hutchins, Jack, Austria, Phillipe, Chaulagain, Amul

arXiv.org Artificial IntelligenceFeb-10-2025

Abstract---Agentic AI is shifting the cybersecurity landscape as attackers and defenders leverage AI agents to augment humans and automate common tasks. In this article, we examine the implications for cyber warfare and global politics as Agentic AI becomes more powerful and enables the broad proliferation of capabilities only available to the most well resourced actors today . As attacks increased in volume and attackers became more sophisticated, moving towards polymorphic malware, packers, and novel evasion techniques, defenders looked to machine learning to provide scalability (quickly analyze large volumes of data and automate repetitive tasks), pattern recognition (detect common attack patterns), and novelty detection (recognize abnormal behaviors that may indicate malicious actors or insider threats). Companies now use Large Language Models (LLMs) to provide analysts and reverse engineers with a rapid analysis of malicious code and best next steps when triaging alerts. But the real paradigm shift in cybersecurity for both attackers and defenders is still on the horizon: agentic artificial intelligence (agentic AI).

agent, agentic ai, oak ridge national laboratory, (11 more...)

arXiv.org Artificial Intelligence

2503.0476

Country:

Europe > Austria (0.05)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Asia > China (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.92)
Government > Regional Government > North America Government > United States Government (0.75)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.88)

Add feedback

Towards Type Agnostic Cyber Defense Agents

Galinkin, Erick, Pountrourakis, Emmanouil, Mancoridis, Spiros

arXiv.org Artificial IntelligenceDec-2-2024

With computing now ubiquitous across government, industry, and education, cybersecurity has become a critical component for every organization on the planet. Due to this ubiquity of computing, cyber threats have continued to grow year over year, leading to labor shortages and a skills gap in cybersecurity. As a result, many cybersecurity product vendors and security organizations have looked to artificial intelligence to shore up their defenses. This work considers how to characterize attackers and defenders in one approach to the automation of cyber defense -- the application of reinforcement learning. Specifically, we characterize the types of attackers and defenders in the sense of Bayesian games and, using reinforcement learning, derive empirical findings about how to best train agents that defend against multiple types of attackers.

defender, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2412.01542

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Maryland > Howard County > Hanover (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Ares: A System-Oriented Wargame Framework for Adversarial ML

Ahmed, Farhan, Vaishnavi, Pratik, Eykholt, Kevin, Rahmati, Amir

arXiv.org Artificial IntelligenceOct-24-2022

Since the discovery of adversarial attacks against machine learning models nearly a decade ago, research on adversarial machine learning has rapidly evolved into an eternal war between defenders, who seek to increase the robustness of ML models against adversarial attacks, and adversaries, who seek to develop better attacks capable of weakening or defeating these defenses. This domain, however, has found little buy-in from ML practitioners, who are neither overtly concerned about these attacks affecting their systems in the real world nor are willing to trade off the accuracy of their models in pursuit of robustness against these attacks. In this paper, we motivate the design and implementation of Ares, an evaluation framework for adversarial ML that allows researchers to explore attacks and defenses in a realistic wargame-like environment. Ares frames the conflict between the attacker and defender as two agents in a reinforcement learning environment with opposing objectives. This allows the introduction of system-level evaluation metrics such as time to failure and evaluation of complex strategies such as moving target defenses. We provide the results of our initial exploration involving a white-box attacker against an adversarially trained defender.

artificial intelligence, attacker, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2210.12952

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

AI & ML Cybersecurity: The Latest Battleground for Attackers & Defenders

#artificialintelligenceFeb-13-2022, 00:46:07 GMT

Machine learning (ML) and artificial intelligence (AI) have emerged as critical tools for dealing with the ever-growing volume and complexity of cybersecurity threats. Machines can recognize patterns to detect malware and unusual activity better than humans and classic software. The technology also predicts potential attacks and automatically responds to threats by identifying specific trends and cycles. Indeed, it's not uncommon to have similar incidents that generally require the same response. Instead of repeating the same procedure, sometimes manually, the system can detect the attack, report and categorize the incident, and then apply the fix automatically.

algorithm, learning, threat, (10 more...)

#artificialintelligence

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.95)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.40)

Add feedback

Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the Age of AI-NIDS

de Witt, Christian Schroeder, Huang, Yongchao, Torr, Philip H. S., Strohmeier, Martin

arXiv.org Artificial IntelligenceNov-23-2021

Cyber attacks are increasing in volume, frequency, and complexity. In response, the security community is looking toward fully automating cyber defense systems using machine learning. However, so far the resultant effects on the coevolutionary dynamics of attackers and defenders have not been examined. In this whitepaper, we hypothesise that increased automation on both sides will accelerate the coevolutionary cycle, thus begging the question of whether there are any resultant fixed points, and how they are characterised. Working within the threat model of Locked Shields, Europe's largest cyberdefense exercise, we study blackbox adversarial attacks on network classifiers. Given already existing attack capabilities, we question the utility of optimal evasion attack frameworks based on minimal evasion distances. Instead, we suggest a novel reinforcement learning setting that can be used to efficiently generate arbitrary adversarial perturbations. We then argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions, and introduce a temporally extended multi-agent reinforcement learning framework in which the resultant dynamics can be studied. We hypothesise that one plausible fixed point of AI-NIDS may be a scenario where the defense strategy relies heavily on whitelisted feature flow subspaces. Finally, we demonstrate that a continual learning approach is required to study attacker-defender dynamics in temporally extended general-sum games.

arxiv, classifier, perturbation, (14 more...)

arXiv.org Artificial Intelligence

2111.12197

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.42)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Physically-interpretable classification of network dynamics for complex collective motions

Fujii, Keisuke, Takeishi, Naoya, Hojo, Motokazu, Inaba, Yuki, Kawahara, Yoshinobu

arXiv.org Machine LearningMay-13-2019

Understanding complex network dynamics is a fundamental issue in various scientific and engineering fields. Network theory is capable of revealing the relationship between elements and their propagation; however, for complex collective motions, the network properties often transiently and complexly change. A fundamental question addressed here pertains to the classification of collective motion network based on physically-interpretable dynamical properties. Here we apply a data-driven spectral analysis called graph dynamic mode decomposition, which obtains the dynamical properties for collective motion classification. Using a ballgame as an example, we classified the strategic collective motions in different global behaviours and discovered that, in addition to the physical properties, the contextual node information was critical for classification. Furthermore, we discovered the label-specific stronger spectra in the relationship among the nearest agents, providing physical and semantic interpretations. Our approach contributes to the understanding of complex networks involving collective motions from the perspective of nonlinear dynamical systems.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

1905.04859

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback